ABSTRACT

We combine high-resolution images in four optical/infra-red bands, obtained with the laser guide 
star adaptive optics system on the Keck Telescope and with the Hubble Space Telescope, to study the 
gravitational lens system SDSSJ0737+3216 (lens redshift 0.3223, source redshift 0.5812). We show
that (under favorable observing conditions) ground-based images are comparable to those obtained 
with HST in terms of precision in the determination of the parameters of both the lens mass distri- 
bution and the background source. We also quantify the systematic errors associated with both the 
incomplete knowledge of the PSF, and the uncertain process of lens galaxy light removal, and find that 
similar accuracy can be achieved with Keck LGSAO as with HST. We then exploit this well-calibrated 
combination of optical and gravitational telescopes to perform a multi-wavelength study of the source 
galaxy at 0.01" effective resolution.

We find the Sersic index to be indicative of a disk-like object, but the measured half-light radius
(r_e = 0.59 ± 0.007 (stat) ± 0.1 (sys) kpc) and stellar mass (M* = 2.0 ± 1.0 (stat) ± 0.8 (sys) x 10^9 M⊙) place
it more than three sigma away from the local disk size-mass relation. The SDSSJ0737+3216 source
has the characteristics of the most compact faint blue galaxies studied, and has comparable size and 
mass to dwarf early-type galaxies in the local universe. With the aid of gravitational telescopes to 
measure individual objects' brightness profiles to 10% accuracy, the study of the high-redshift size- 
mass relation may be extended by an order of magnitude or more beyond existing surveys at the 
low-mass end, thus providing a new observational test of galaxy formation models. 
Subject headings: galaxies: fundamental parameters — gravitational lensing — instrumentation: adap- 
tive optics — methods: data analysis — techniques: high angular resolution 



1. INTRODUCTION 

Galaxies do not appear in arbitrary combinations of 
luminosity, mass and shape, but instead obey empiri- 
cal scaling relations (such as the Fundamental Plane for 
early-type galaxies). Explaining the origin, and cosmic 
evolution, of the scaling relations is a fundamental goal 
of galaxy formation theories. 

As far as disk galaxies are concerned, the hierarchi- 
cal structure formation scenario predicts a correlation 
between size and stellar mass, with width depending
on the distribution of the initial spin of the dark halos
(Fall & Efstathiou 1980). At any given mass, the
expected distribution of sizes is well-approximated by
a log-normal distribution. Qualitatively, this prediction
is quite robust, although the exact forms of the
correlation and the distribution depend on the details
of baryonic processes such as energy feedback from
star formation and bulge instability (Mo et al. 1998;
Shen et al. 2003; Tonini et al. 2006; Dutton et al. 2007;
Stringer & Benson 2007). Therefore, measuring the
shape and width of the correlation provides not only 
a test of the standard paradigm, but also valuable in- 
formation on the poorly-understood baryonic processes 
happening at sub-galactic scales. 

From an empirical point of view, the relation be- 
tween size, luminosity (or equivalently surface bright- 
ness) and stellar mass is well established for disk galaxies 
in the local Universe (e.g. Shen et al. 2003; Driver et al.
2005). Analysis of suitable objects in the Sloan Digital
Sky Survey shows that at any given mass (luminosity) 
the distribution of galaxies is indeed well-approximated 
as log-normal, although the scaling with mass of the 
characteristic size and the width of the distribution are 
non-trivial. Defining disk galaxies as those being well-fit
by a single Sersic component with index n < 2.5,
Shen et al. (2003) find that above a characteristic stellar
mass (log M*,0/M⊙ ≈ 10.6, corresponding to approximately
M_r,0 = -20.5), size scales rapidly with stellar
mass (R ~ M^0.39) and the scatter is relatively small
(σ_lnR ~ 0.34). Below the characteristic stellar mass the
correlation flattens (R ~ M^0.14) and the scatter increases
significantly (σ_lnR ~ 0.47).

At intermediate redshift (0.1 ≲ z ≲ 1) the nature
and interpretation of the size-luminosity or size-mass
relation are more uncertain. Several authors (e.g.
Ferguson et al. 2004; Barden et al. 2005; Trujillo et al.
2006; Melbourne et al. 2006) have used Hubble Space
Telescope images to determine the sizes of intermediate
and high (z ≳ 1) redshift galaxies, down to the resolution
and completeness limits of HST (roughly equivalent
to 1 kpc and 10^10 M⊙). Recent studies conclude, taking
selection effects into account, that there is significant
evolution in the size-luminosity relation (Barden et al.
2005; Trujillo et al. 2006; Melbourne et al. 2006). However,
it is hard to disentangle luminosity evolution from
size evolution, to ensure that samples at different red- 
shifts are directly comparable, and to compare results 
from different studies, as the selection criteria are often 
similar but not identical (e.g. color vs. morphology;
morphology determined via Sersic index vs. bulge to disk
decomposition vs. concentration parameter vs. visual
classification). Overall, it appears that disk galaxy
evolution cannot be explained by pure luminosity or pure size
evolution, but requires a combination of both. In contrast,
the relation between size and stellar mass appears
to have changed very little since z ~ 1 (Barden et al.
2005), much less than would be expected in the naive
model where stellar mass and size are proportional to
the virial mass and radius (and hence size is expected
to scale as H(z)^(-2/3), where H(z) is the Hubble parameter).
Rather, galaxies appear to be growing "inside-out"
in scale radius as their stellar mass increases such
that the size-mass relation is preserved over cosmic
time (Barden et al. 2005). It has been suggested that
galaxy evolution models that take into account the ever- 
increasing concentration of dark matter halos, and the 
further effect of baryons via adiabatic contraction could 



provid e the physics require d to reproduce the observed 
trend (iSomerville et al.|[2006 ) , although this may make it 
more difficult to reproduce simult aneously other scalins 



more dithcuit to reproduce smiult aneously other scaimg 
laws, for example the TuUy-F isher (|Tullv fc Fisheiill977Tl 
relation putton et al.ll2007f) . 

Lower mass (M* ≲ 10^10 M⊙) galaxies are even less
well understood. While the local size-mass relations of
Shen et al. (2003) for low (n < 2.5) and high (n > 2.5)
Sersic index objects diverge, the interpretation of Sersic
index as a morphological galaxy classifier becomes more
uncertain at lower masses (e.g. Capaccioli et al. 1992;
Trujillo et al. 2004). At the same time, the measurement
of the structural parameters themselves becomes harder
as the galaxy size decreases. Nevertheless, such small
galaxies are important objects to understand: the luminous
compact blue galaxies first noted by Koo & Kron
appear in large numbers at intermediate redshifts
in deep HST images (e.g. Noeske et al. 2006;
Rawat et al. 2007), but evolve very rapidly to vanishing
abundance in the local universe. What becomes of these
objects, which represent sites of small-scale but vigorous
star formation, is a topic of some debate, with dwarf
spheroids (e.g. Koo et al. 1994; Noeske et al. 2006) and
the bulges of disk galaxies (e.g. Hammer et al. 2001;
Rawat et al. 2007) the principal candidates.



Gravitational lensing is a powerful tool with which to 
extend the investigation of scaling laws over cosmic time
{e.g. [^iul[200l . On the one hand, the lensing geom- 
etry provides a precise and almost model-independent 
measure of total mass of the lens galaxy. Since the lens 
galaxies are mostly early-type galaxies (or the bulges of 
spirals), this gives a new handle on the mass profile of 
these systems (Treu & Koopmans 2004; Koopmans et al.
2006) and hence, for example, on the relationship between
stellar and total mass (Bolton et al. 2007). On the
other hand, the background source is typically magnified 
by a factor of ~10, mostly in the form of a stretch along 
the azimuthal direction. While lensing preserves surface 
brightness, the increase in apparent size of the lensed 
source means that the number of pixels at any one sur- 
face brightness also increases, such that the isophotes are 
observed at higher signal-to-noise. Thus, gravitational 
lenses act as natural telescopes, allowing one to gain a 
factor of ~10 in sensitivity and spatial resolution, and
thus improve markedly our ability to study the size and 
dynamical mass (through rotation curves) of intermediate
and high redshift galaxies. For example, studies of
the internal structure of faint blue galaxies (Ellis 1997),
and in particular the most compact of these (Koo et al.
1994), are currently limited by the resolution of HST
(Phillips et al. 1997). When magnified by a gravitational
lens, such objects become well-resolved. Thanks to the 
dedicated efforts of several groups, the number of known 
gravitational lenses is increasing dramatically: it is now 
possible to envision statistical studies of relatively large 
samples of lensing or lensed galaxies in the near future.

In this paper we present multi-color high-resolution images
of the gravitational lens system SDSSJ0737+3216
(Bolton et al. 2006), obtained with both the Hubble
Space Telescope and with the Laser Guide Star Adaptive 
Optics (LGSAO) System on Keck II. The scientific goal 
of the analysis of this case study is two-fold. First, we 
perform a detailed comparison of the results of the lens 
modeling across bands, showing that - when a bright 
nearby star is available for tip-tilt correction and condi- 
tions are favorable - the most important parameters can 
be measured with comparable accuracy with HST and 
Keck-LGSAO. Second, we exploit this particular cosmic 
telescope to achieve super-resolution of the source galaxy.
(See McKean et al. 2007 for Keck LGSAO observations
of a lens with a point-like source.) With a lens magnification
of μ ≈ 10, the resolution of the HST and Keck
images (~0.1" FWHM) corresponds to a physical scale
of (0.66 kpc/μ ~ 0.05 kpc) at the redshift of the source z_s
= 0.5812, comparable to the resolution attainable from
the ground when studying galaxies in the Virgo Cluster 
in 1 arcsec seeing. We derive the Sersic index, size, and 
stellar mass of the source, and show that using gravita- 
tional telescopes the size-mass relation may be extended 
by an order of magnitude in size with respect to current 
studies, thus allowing one to probe, for example, whether 
the change in slope and intrinsic scatter below the char- 
acteristic mass persists to higher redshifts. 
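
The physical scale quoted above can be checked with a short calculation. The following is a minimal sketch, assuming the flat concordance cosmology adopted in this paper (matter density 0.3, dark energy density 0.7, Hubble constant 70 km/s/Mpc); the function names are illustrative only and are not taken from any analysis code used in this work.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, omega_l=0.7, steps=1000):
    """Angular diameter distance (Mpc) in a flat Lambda-CDM cosmology.

    The comoving distance integral is evaluated with composite
    Simpson's rule; `steps` must be even.
    """
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return comoving / (1.0 + z)


def kpc_per_arcsec(z):
    """Proper transverse size (kpc) subtended by one arcsecond at redshift z."""
    arcsec_in_rad = math.pi / (180.0 * 3600.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * arcsec_in_rad


# Size subtended by a 0.1 arcsec resolution element at the source redshift
scale_kpc = 0.1 * kpc_per_arcsec(0.5812)
print(f"0.1 arcsec at z = 0.5812 corresponds to {scale_kpc:.2f} kpc")
```

Dividing this scale by the lens magnification gives the effective resolution in the source plane.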

This paper is organized as follows. After describing
the observations in section 2 we outline in sections 3
and 4 two sources of systematic error and our strategies
for dealing with them, before explaining our modeling
methodology in section 5. In sections 6 and 7 we
present our results, which are then discussed (section 9)
before we draw conclusions in section 10. Throughout
this paper magnitudes are given in the AB system. We
assume a concordance cosmology with matter and dark
energy densities Ω_m = 0.3, Ω_Λ = 0.7, and Hubble constant
H_0 = 70 km s^-1 Mpc^-1.

2. DESCRIPTION OF THE OBSERVATIONS 

2.1. NIRC2 on Keck

On December 11, 2006, we imaged SDSSJ0737+3216 
with the LGSAO system on the Keck II telescope. The 
images were taken in the K'-band with the near-infrared
camera (NIRC2) in the wide field of view (40" x 40").
The pixel scale for this configuration is 0.04" pix^-1. A
total of 3120 seconds of exposure was obtained. To avoid 
saturating stars in the field, individual exposures were 1 
minute in duration (divided into two 30-second co-adds). 
A dither was executed after every set of 2 exposures to 
improve the sky sampling. Dithers were randomly cho- 
sen using a script with a circular dither pattern of radius 
3". The laser was positioned at the center of each frame, 
rather than fixed on the central galaxy. Steinbring et 
al. (2007, submitted) demonstrate that this method pro- 
vides a more uniform AO correction over a larger area, 
in comparison with the fixed laser method. Observing 
conditions during the run were good. 

The observations were made as part of the Center for 
Adaptive Optics Treasury Survey (CATS, Larkin et al. 
2007, in prep), which aims to image ~ 1000 distant 
galaxies with Keck adaptive optics. The images were 
processed with the CATS reduction procedure described
in Melbourne et al. (2005). A sky frame and a sky flat
were created from the individual science exposures after 
masking objects. Frames were then flat-fielded and sky-subtracted.
The images were de-warped to correct for
known camera distortions. The frames were aligned by 
centroiding on objects in the field, and finally co-added 
to produce the final image. 

The final processed image shows three unsaturated 
stars lying within 10" of the lens position. Two of these 
stars are between the tip-tilt star and the lens. The 
third lies on the opposite side of the lens from the tip-tilt
star. These stars provide a very strong constraint on the
point-spread function (PSF), which is often difficult to
track for AO observations. A further constraint on the 
PSF comes from observations of a PSF star-pair. The 
star-pair observations were made immediately following 
the lens observations. We picked a star pair that had 
a tip-tilt/PSF orientation and separation similar to the 
tip-tilt/lens system. As a result, the lens observation 
has one of the best constrained PSFs ever obtained for 
an extragalactic AO observation. 

A visual inspection of the stars in the field reveals an
approximate double Gaussian profile, as used in simple
models of adaptive optics PSFs (e.g. Law et al. 2006).
The small-scale component of this profile was observed
to have a FWHM of ≈ 2.5 pixels, or 0.10 arcsec. This is
significantly larger than the diffraction limit of Keck in
the K-band (≈ 0.06"), but is primarily the result of using
the wide-field camera which slightly under-samples the
PSF. The large-scale Gaussian component has a FWHM
of ≈ 0.40 arcsec, indicating very good seeing. From this
simple PSF picture we estimate the Strehl ratio to be
approximately 0.35 for all the PSF stars, demonstrating
consistently good AO performance in these observations.
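
As an illustration of the double Gaussian picture, the sketch below builds a unit-flux profile from the two measured FWHM values (0.10 and 0.40 arcsec). The fraction of flux in the narrow core is not quoted in the text, so `core_fraction` is a purely hypothetical parameter, and this sketch does not attempt to reproduce the Strehl ratio estimate.

```python
import math

# Conversion between the FWHM and the standard deviation of a Gaussian
FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def double_gaussian_psf(r, fwhm_core=0.10, fwhm_halo=0.40, core_fraction=0.5):
    """Surface brightness at radius r (arcsec) of a unit-flux double Gaussian.

    core_fraction is an assumed (hypothetical) share of the flux in the core.
    """
    value = 0.0
    for fwhm, weight in ((fwhm_core, core_fraction), (fwhm_halo, 1.0 - core_fraction)):
        sigma = fwhm * FWHM_TO_SIGMA
        value += weight * math.exp(-0.5 * (r / sigma) ** 2) / (2.0 * math.pi * sigma ** 2)
    return value


def encircled_flux(r, fwhm_core=0.10, fwhm_halo=0.40, core_fraction=0.5):
    """Fraction of the total flux enclosed within radius r (arcsec)."""
    total = 0.0
    for fwhm, weight in ((fwhm_core, core_fraction), (fwhm_halo, 1.0 - core_fraction)):
        sigma = fwhm * FWHM_TO_SIGMA
        # A circular 2D Gaussian encloses 1 - exp(-r^2 / (2 sigma^2)) of its flux
        total += weight * (1.0 - math.exp(-0.5 * (r / sigma) ** 2))
    return total


for radius in (0.05, 0.10, 0.20, 0.40):
    print(f"r = {radius:.2f} arcsec: enclosed fraction {encircled_flux(radius):.3f}")
```

A useful sanity check is that a single circular Gaussian encloses exactly half of its flux within a radius of FWHM/2.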

2.2. ACS/NICMOS on HST 

The lens system was observed with the Advanced Cam- 
era for Surveys (ACS) and with the Near Infrared Cam- 
era and Multi Object Spectrograph (NICMOS) on board 
HST on November 5 2006, as part of HST program 10494
(PI: Koopmans). One-orbit integrations were obtained
through filters F555W (2200s) and F814W (2272s) with
the Wide Field Camera centering the lens on the WFC1
aperture, i.e. in the center of the second CCD. Four
sub-exposures were obtained with a half-integer pixel offset
(acs-wfc-dither-box) to ensure proper cosmic ray
removal and sampling of the point spread function. A 
one orbit integration with the NIC2 camera through fil- 
ter F160W was obtained with NICMOS in multiaccum 
mode for a total exposure time of 2560s. As with ACS, the 
integration was split in four sub-exposures with a semi- 
integer pixel offset to ensure proper cosmic ray/defect 
removal and improve sampling of the point spread func- 
tion. 

The ACS data were reduced using multidrizzle
(Koekemoer et al. 2002) as described in Gavazzi et al.
(2007). The NICMOS data were reduced using
a set of IRAF scripts based on the dither
package (Fruchter & Hook 2002), as described in
Treu & Koopmans (2003). The output pixel size was set
to match that of NIRC2 (0.0397") to facilitate comparison
between the HST and reduced NIRC2 images. 

3. PSF CHARACTERIZATION 

In order to predict accurately the data given a model 
lens image, we must convolve it with the point spread 
function (PSF) of the telescope. For the instruments on 
HST the PSF is calculable from the engineering parameters
that characterize the optics and detectors, using the
Tiny Tim package (Krist & Hook 1997). However, the
PSF varies over time, both as a result of the "breathing"
of the telescope over the course of an orbit, and
monotonically as the system ages: the Tiny Tim approximation
is not always sufficient.

Somewhat similarly, the PSF derived from first princi- 
ples for an adaptive optics system is the sum of a Moffat 
profile for the seeing disk, and the diffraction pattern due 
to the telescope itself. In practice, the seeing, and the 
Strehl ratio, vary over the course of a set of observations, 
making a priori predictions of the PSF of limited use. 

In principle, one could include some variable parame- 
ters to describe the model PSFs introduced above, and 
then fit for them simultaneously with the lens model parameters.
We show in Section 6 that there is indeed
enough information in our data to constrain the PSF,
thanks to the multiple images produced by the lens, but
defer the investigation of model PSF parameters to fur- 
ther work. Here we take a pragmatic approach and use 
nearby unsaturated stars as estimates of the PSF at the 
position of the lens. For the case of SDSSJ0737+3216
there are three suitable stars within ≈ 10 arcsec from
the lens; we excised small cutout images of these stars
from the images from each instrument/filter combination.
The properties of these stars (henceforth referred
to as PSF1, PSF2, and PSF3) are given in Table 1. In
addition, for the NIRC2 observations we used a fourth
star as described in section 2.1. The use of any given
one of these stellar model PSFs constitutes an assumption
which we can test using a statistical model selection
procedure we describe below.

Prior to the taking of these deep images, shallow (420s) integrations
were obtained with ACS in both the F435W and F814W
filters (Bolton et al. 2006), as part of the initial SLACS snapshot
program. These data are not used here due to the low signal-to-noise
and significant cosmic ray contamination, both of which
prevent detailed study of the faint ring.

Fig. 1. NIRC2+LGSAO K'-band (left) and HST NICMOS F160W-band (right) images of SDSSJ0737+3216, showing the stars used
in the PSF modeling. A further PSF star was observed for the NIRC2+LGSAO analysis.



TABLE 1

Properties of stars used in the PSF characterization.
δθ_lens is the angular separation between the PSF star and
the lens system / laser spot chip position. δθ_tt is the
angular separation between the PSF star and the tip-tilt
star. PSF0 was observed only during the AO run.

ID     RA            Dec.          δθ_lens    δθ_tt      Magnitude
       (J2000)       (J2000)       (arcsec)   (arcsec)   (AB)
PSF0   07:03:11.84   -08:20:51.8   0.0        17.8       15.0
PSF1   07:37:28.54   +32:16:10.2   8.5        14.7       18.2
PSF2   07:37:28.46   +32:16:12.3   6.3        16.4       18.1
PSF3   07:37:28.51   +32:16:24.2   5.6        28.4       18.3



This phenomenological model has the advantage that
it takes into account the time-variability of the PSF as 
well as possible, providing a simultaneous estimate of the 
PSF with the actual data. It also takes into account the 
details of the image combination procedure in a natural 
way - whatever was done to the pixels of the lens image 
was also done to the PSF. One disadvantage of our ap- 
proach is the introduction of additional noise - however, 
the stars are significantly brighter than the lens system 
and the pixel noise in the PSF images can, we believe, 
be safely neglected. Three other disadvantages of our ap- 
proach are that the stellar spectra will not exactly match 
the spectra of the lens or source galaxies within a given 
filter, nor will the position of the PSF stars within a pixel 
exactly match the intra-pixel centroiding of the lens or 
source galaxies, and nor will the PSF at the position of
the stars exactly match that at the lens position. In
the absence of a suitable interpolation scheme to solve 
these problems, we resign ourselves to having just three 
models to choose from, and attempt to infer the most ap- 
propriate one of the three from the data. Following this 
procedure will give us an indication of the relative im- 
portance of accurately knowing the PSF. In other words, 
the variation of the results as a function of adopted PSF 
will give us an indication of the systematic error introduced
by our approximate PSF. As we will show in the
next sections, the parameters that we are interested in
are fairly insensitive to the choice of the PSF, and
our ignorance of the PSF is not the dominant source
of error in our analysis.

4. LENS GALAXY SUBTRACTION 

As can be seen in Figure 1, the lens galaxy is much
brighter than the (lensed) source galaxy, and is a significant
source of contamination at the arc positions. The
usual approach to this problem is to subtract a smooth
intensity distribution fitted to the lens galaxy light.
Bolton et al. (2006) found it necessary to use a flexible
B-spline model, combined with careful manual masking
of the multiple images, in order to obtain a satisfactory
removal of the lens light. The problem is that it
is fundamentally very difficult to disentangle the light
coming from the lens galaxy from that coming from
the source. Moustakas et al. (2006) used the simpler
elliptically-symmetric Moffat profile; a Sersic profile fit
could also have been performed. To quantify this source 
of systematic uncertainty we investigate both lens galaxy 
subtraction methods found in the literature, test them as 
best we can using the data, and compare the results in 
terms of relevant lens and source parameters. 

In subtraction scheme sub we used a SExtractor
segmentation map to mask out the detected pixels associated
with the lensed images, and then fitted an elliptically
symmetric B-spline model with two angular modes
(see the appendix of Bolton et al. 2006 for details).
In this scheme, there is a danger that the tangentially-stretched
images will be truncated, leading to an overly-compact
inferred source. The Moffat profile fit (henceforth
referred to as subtraction scheme msub) was performed
as in Moustakas et al. (2006), with no masking
of the image. This model has the benefit of being somewhat
more robust, but must be expected to provide a
much poorer quality of fit, leaving more lens galaxy flux
in the residual image and leading to a brighter, larger
inferred source. Based on these considerations, we expect
that the two schemes will bracket the ideal solution and 
thus provide an estimate of the systematic uncertainty. 
A Sersic profile fit may well provide a better fit to the 
lens galaxy light than the Moffat profile: we use the Mof- 
fat profile in order to make our systematic error estimate 
a conservative one. 

5. LENS MODELING METHODOLOGY 

Modeling of the images of extended sources lensed by 
galaxy-scale lenses has been the subject of some considerable
research in the last few years (see e.g. Warren & Dye
2003; Treu & Koopmans 2004; Dye & Warren 2005;
Koopmans 2005; Suyu et al. 2006; Brewer & Lewis
2006; Barnabè & Koopmans 2007). The differences between
these works revolve around the choice of regularization
scheme for the reconstructed source plane image, while
the lens models are largely consistent between the methods
and reflect the simplicity and consistency observed
in gravitational lens potentials (Koopmans et al. 2006).
The regularization is important due to the very large 
numbers of parameters employed to describe the source 
plane intensity. 

In this work, and in a previous article (Moustakas et al.
2006), we choose to model the source galaxy using
simply-parametrized elliptically-symmetric Sersic profile 
components. We pursue this approach for two reasons. 
Firstly, images of intermediate and high redshift galax- 
ies very often show morphologies representable by collec- 
tions of simply-parameterized components (bulges, disks, 
star-forming regions etc.). The second reason is that we 
seek a quantitative understanding of galaxy luminosity, 
mass, size and shape as a function of redshift, and this 
is best achieved by analyzing the image data within the 
context of a sensible phenomenological model (the Sersic 
profile). The resulting inferences will of course be model-dependent
(by design), and we should expect the corresponding
precision to be high as a result of the additional
information used in the fit. Most importantly, our
results will be directly comparable to other photometric 
and morphological studies. After all, a pixel based re- 
construction will have to be fit by a parameterized Sersic 
model in order to derive shape and luminosity parame- 
ters that can be compared with the literature. 

For our lens models we follow previous authors
and use the singular isothermal ellipsoid (SIE)
model (e.g. Kormann et al. 1994). A number of authors
(e.g. Treu & Koopmans 2004; Rusin & Kochanek 2005;
Koopmans et al. 2006) have shown the SIE model to provide
a very good approximation of the lens potential on
galaxy scales. The basic lens equations describing the deflection
of light by this model can be found readily elsewhere
(e.g. Kormann et al. 1994; Evans & Wilkinson
1998; Evans & Witt 2001; Schneider) and are not
repeated here. Suffice to say that given
the deflection angle as a function of lens plane position,
the corresponding source plane position can be rapidly
calculated, using the formulae in e.g. Evans & Witt
(2001). The price we pay for this high computation speed
is a significant systematic error in the source parameters
as inferred through the lens. The intrinsic spread of logarithmic
density slopes (where the SIE profile has slope
m = 1) is approximately 0.12, based on the large sample
of strong lenses analysed by Koopmans et al. (2006): in
the appendix we show that this gives rise to a fractional 
uncertainty in source size of about 12%, and an error 
in the inferred source magnitude of 0.26. Implementing 
a more flexible lens model would translate this system- 
atic error into a statistical one - while this is beyond the 
scope of this paper we note that a reasonable goal is to 
reduce all other systematic errors to below the level set 
by the lens mass profile. 
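
To make the mapping from lens plane to source plane concrete, the sketch below evaluates the SIE deflection angle in the closed form given by Kormann et al. (1994) and applies the lens equation. It is a minimal illustration under stated assumptions, not the implementation used in this work: the lens is centred on the origin with its principal axes aligned with the coordinate axes, `theta_e` is the angular normalisation (the Einstein radius in the circular limit), and the function names are hypothetical.

```python
import math


def sie_deflection(x, y, theta_e=1.0, q=0.8):
    """Deflection angle (ax, ay) of a singular isothermal ellipsoid.

    Convergence: kappa = theta_e * sqrt(q) / (2 * sqrt(x^2 + q^2 y^2)),
    so the mass distribution is elongated along the y axis for q < 1.
    """
    r = math.hypot(x, y)
    if r == 0.0:
        return 0.0, 0.0
    if q >= 1.0:
        # Circular limit: singular isothermal sphere
        return theta_e * x / r, theta_e * y / r
    qp = math.sqrt(1.0 - q * q)
    prefactor = theta_e * math.sqrt(q) / qp
    ax = prefactor * math.asinh(qp * x / (q * r))
    ay = prefactor * math.asin(qp * y / r)
    return ax, ay


def source_position(x, y, theta_e=1.0, q=0.8):
    """Lens equation: source plane position = image position - deflection."""
    ax, ay = sie_deflection(x, y, theta_e=theta_e, q=q)
    return x - ax, y - ay


# Example: where does the image plane point (0.7, 0.4) map to in the source plane?
print(source_position(0.7, 0.4))
```

A lens offset and orientation angle can be added by translating and rotating the coordinates before the call and rotating the deflection back afterwards.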

Since our source surface brightness distribution is the 
analytic Sersic profile, we can compute the source in- 
tensity at each desired source plane position, and assign 
it to the original image plane pixel value - we do this 
on a twice sub-sampled grid to reduce rounding errors. 
(This simple but effective "poor man's ray-tracing" is
described further in Schneider et al. 1992.) In this way
a predicted image can be calculated for any given set
of lens and source parameters. Before comparison with
the data image we convolve the model image with a PSF
image (derived from the image of a nearby star, as described
in Section 3 above). With the PSF image grid
being much smaller than the data image grid the speed 
of the computation is greatly increased. 
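
The procedure described in this paragraph can be sketched as follows. This is a simplified illustration rather than the code used for the analysis: for brevity it assumes a circular (SIS) lens and a circular Sersic source, uses the common approximation b_n = 2n - 1/3, and convolves with a small hypothetical kernel instead of a stellar cutout. All names and numerical values are placeholders.

```python
import math


def sersic(r, r_e=0.1, n=1.0, i_e=1.0):
    """Circular Sersic profile; b_n = 2n - 1/3 is an approximation."""
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))


def sis_deflection(x, y, theta_e=1.0):
    """Deflection of a singular isothermal sphere centred on the origin."""
    r = math.hypot(x, y)
    return (0.0, 0.0) if r == 0.0 else (theta_e * x / r, theta_e * y / r)


def predict_image(npix, pixel_scale, src_x=0.0, src_y=0.0, sub=2):
    """Ray-trace each sub-pixel back to the source plane and average."""
    half = 0.5 * npix * pixel_scale
    image = [[0.0] * npix for _ in range(npix)]
    for j in range(npix):
        for i in range(npix):
            total = 0.0
            for sj in range(sub):
                for si in range(sub):
                    # Centre of this sub-pixel in the image plane (arcsec)
                    x = (i + (si + 0.5) / sub) * pixel_scale - half
                    y = (j + (sj + 0.5) / sub) * pixel_scale - half
                    ax, ay = sis_deflection(x, y)
                    bx, by = x - ax, y - ay  # lens equation
                    total += sersic(math.hypot(bx - src_x, by - src_y))
            image[j][i] = total / sub ** 2
    return image


def convolve(image, kernel):
    """Direct convolution with a small odd-sized kernel (zero padding)."""
    n, k = len(image), len(kernel)
    off = k // 2
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            acc = 0.0
            for kj in range(k):
                for ki in range(k):
                    jj, ii = j + off - kj, i + off - ki
                    if 0 <= jj < n and 0 <= ii < n:
                        acc += kernel[kj][ki] * image[jj][ii]
            out[j][i] = acc
    return out


# A source on the optical axis of a circular lens produces an Einstein ring
image = predict_image(npix=41, pixel_scale=0.1)
psf = [[0.05, 0.1, 0.05], [0.1, 0.4, 0.1], [0.05, 0.1, 0.05]]  # hypothetical kernel
blurred = convolve(image, psf)
```

Because the kernel is much smaller than the image, the cost of the direct convolution scales with the kernel area rather than the image area, which is the speed advantage noted in the text.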

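The N-pixel model image d_p(x) and data image d are
compared via the likelihood function:

    \Pr(\mathbf{d}|\mathbf{x}) = \frac{1}{Z_L} \exp\left(-\frac{\chi^2}{2}\right),    (1)

where

    \chi^2 = \sum_{i=1}^{N} \frac{\left(d_{p,i}(\mathbf{x}) - d_i\right)^2}{\sigma_i^2}    (2)

and

    Z_L = (2\pi)^{N/2} \prod_{i=1}^{N} \sigma_i.    (3)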

This form contains an implicit assumption of uncorre- 
lated Gaussian pixel noise, which is well-justified for the 
background-limited Keck data. When using the HST images,
we note that the counts are such that the
Gaussian approximation to the Poisson distribution is
always valid, and compute the uncertainties σ_i from the
square root of the image itself. We account for the correlated
noise introduced by the drizzling procedure by computing
the equivalent single pixel noise (Casertano et al.
2000), essentially by reducing the uncertainties by a factor
close to the fourth power of the ratio between the
output and input pixel scales. This has the effect of
making the reduced chi-squared approximately equal to
unity in the case of a good fit. In principle one could
estimate the pixel covariance matrix and use that in the
calculation of χ², at greater computational expense. We
leave this to future work, and note that the correlated 
errors are unlikely to affect our statistical error bars by 
more than a factor of two. As we shall see, systematic 
errors are of greater concern. 
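
For uncorrelated Gaussian pixel noise, the likelihood of equations (1) to (3) is most conveniently evaluated in logarithmic form. The sketch below is a minimal illustration with flattened pixel lists and a hypothetical function name; it is not the code used for the analysis.

```python
import math


def log_likelihood(model, data, sigma):
    """Return (ln Pr(d|x), chi-squared) for uncorrelated Gaussian noise.

    model, data and sigma are equal-length sequences of predicted pixel
    values, observed pixel values and per-pixel uncertainties.
    """
    chi2 = sum(((m - d) / s) ** 2 for m, d, s in zip(model, data, sigma))
    # ln Z_L = (N/2) ln(2 pi) + sum_i ln(sigma_i)
    log_z = 0.5 * len(data) * math.log(2.0 * math.pi) + sum(math.log(s) for s in sigma)
    return -0.5 * chi2 - log_z, chi2


# Toy example with four pixels, each offset from the data by one sigma
log_l, chi2 = log_likelihood([2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0])
print(log_l, chi2)
```

Working with the logarithm avoids numerical underflow when the number of pixels is large.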

Our simple lens model has 5 parameters: position {x 
and y), velocity dispersion ctsie,^^ mass distribution el- 
lipticity (defined as e = (1 — g^)/(l -I- g^) where q is the 

While the strong lensing image separation is a direct measure- 
ment of the mass enclosed by the Einstein radius, when working 
with the SIE model the overall normalisation is more conveniently 
described by the single parameter (Tqie. This has the added bene- 
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axis ratio), and orientation angle. We assign uniform 
prior PDFs on the latter three; for the lens centroid we 
take the center of the lens light as our best guess, and 
assert a Gaussian prior PDF of width one pixel centered 
on this position. Similarly, for the source position we as- 
sign a Gaussian prior PDF of width 0.1 arcsec centered 
on the lens position. (Since we know that the source 
is lensed, and into a almost circular ring at that, we 
know that the source position must be very close to the 
optical axis. The Gaussian prior does allow for puta- 
tive source positions at larger radii, but has the effect of 
sensibly down- weighting those models which are unlikely 
to provide a good fit. The value of 0.1 comes purely 
from experience with looking at lens models and simu- 
lated lenses.) However, we assign uninformative uniform 
priors for the orientation phi, Sersic index n, effective 
radius 6c, and source magnitude (where the logarithmic 
nature of this quantity captures our even greater prior 
ignorance). For the ellipticity we assume the standard 
weak lensing intrinsic ellipticity distribution, a Rayleigh 
distribution of mean 0.25. (Note that the relation be- 
tween the effective radius 9^, effective semi- major axis Oc 
and axis ratio Qs is Oe = acy^, so that our effective radii 
ma y be compared di rectly with the "circularised" radii of 
e.q. lShen et al]|2003 ^. We shaU see in sections lO and [711 
that our choices of prior PDF have very little influence on 
the posterior inferences. These are defined by the joint 
posterior PDF: 



$$\Pr(x|d,H) = \frac{\Pr(d|x,H)\,\Pr(x|H)}{\Pr(d|H)}. \qquad (4)$$

Pr(x|H) is the product of the individual prior PDFs
referred to above. We sample the unnormalized numerator
of equation (4) using the multi-purpose Markov
Chain Monte-Carlo code BayeSys (Skilling 2004), a robust
package used in a number of other cosmology and
lensing analyses (e.g. Odman et al. 2004; Marshall 2006;
Limousin et al. 2006 and Jullo et al. 2007, in prep.).

The symbol H in equation (4) represents the set of
assumptions that go into the inference of the parameters
via the MCMC analysis. Such models can be compared
quantitatively using the evidence, Pr(d|H). This statistic
is calculated during the initial "burn-in" period of
the sampler, and, while dominated by the goodness of
fit, does take into account the different prior PDFs that
might be employed. For further reading about evidence
analysis we recommend MacKay (2003).

In this work, the prior PDFs are kept fixed while different
PSF models and lens galaxy subtraction schemes
are tried, an approach also followed by Suyu et al. (2007,
in prep.). A simple ranking could be achieved by using
some different monotonic function of the chi-squared
statistic; we note here though that the correct weights
to use when combining parameter estimates from different
analyses are exactly the evidence values (provided all
models are deemed equally probable a priori). This can
be seen by marginalizing the parameter posterior PDF
over the models - each individual model's posterior gets
multiplied by its (renormalized) evidence during the
summation:

$$\Pr(x|d) = \sum_H \Pr(x|d,H)\,\Pr(d|H). \qquad (5)$$
fit of being (more or less) straightforwardly connected to dynamical
mass estimates from spectroscopic velocity dispersions (e.g.
Treu & Koopmans 2003)



In practice, one model often has much higher evidence 
than the others on offer, meaning that the sum can be 
approximated by this single term: this is model selection. 
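
The evidence weighting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the analysis: it assumes equal prior probability for each model, takes the three relative evidence values listed for the NIRC2 PSF models in Table 2, and uses made-up posterior means (`example_means`) purely to show the arithmetic.

```python
def evidence_weights(relative_evidences):
    """Renormalize relative evidences into model weights.

    Assumes all models are equally probable a priori, as in the text.
    """
    total = sum(relative_evidences)
    return [z / total for z in relative_evidences]


def model_averaged_mean(means, relative_evidences):
    """Evidence-weighted average of per-model posterior means."""
    weights = evidence_weights(relative_evidences)
    return sum(w * m for w, m in zip(weights, means))


# Relative evidences of the three NIRC2 PSF models, as listed in Table 2.
nirc2_evidence = [1.0, 0.03, 0.001]
weights = evidence_weights(nirc2_evidence)

# Hypothetical per-model posterior means of some parameter (illustrative only).
example_means = [0.150, 0.152, 0.149]
averaged = model_averaged_mean(example_means, nirc2_evidence)

print("weights:", [round(w, 4) for w in weights])
print("model-averaged mean:", round(averaged, 5))
```

With weights this lopsided the average is set almost entirely by the highest-evidence model, which is why keeping only that term is a reasonable approximation.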

6. LENS MODELING RESULTS 

Figure [2] shows the fits to the four imaging datasets 
introduced above. For subtraction scheme sub (see sec- 
tion m shown in the Figure) the residuals are close to 
zero, with little significant structure in the residual
images (especially in the infra-red filters). We show the
results of the statistical model selection analysis in
Table 2 for all datasets.

We find that the different PSF models are easily dif- 
ferentiated (top half of the table), with typical evidence 
ratios of a few tens. This is reflected in the chi-squared 
statistic, which is not surprising given that the parame- 
ter space volumes are identical between the different PSF 
models. The relative evidence is determined almost
entirely by the goodness of fit, which is significantly better
for PSF0 in the case of the NIRC2 data. This may
be due to the shape of the PSF at the lens being better
matched by a stellar image at the same position relative
to the laser spot (which PSF0 provides). For the HST
datasets, the most appropriate PSF star to use varies
between filters, as we might expect.

The situation with the lens galaxy subtraction schemes 
is less clear: here the goodness of fit is dominated by the 
lens galaxy model such that we cannot use the evidence 
straightforwardly to select the most appropriate model 
for the source galaxy. The limiting case would be a lens 
galaxy model so flexible that all the flux was subtracted, 
leaving a zero-flux inferred source and a chi-squared of 
zero. What we can take from Table [2] is that the low 
goodness of fit associated with subtraction scheme msub 
indicates that a significant amount of lens galaxy flux 
is being left un-subtracted, a conclusion vindicated by 
inspection of the residual images (not shown). The dif- 
ferent schemes provide us with a rough estimate of the 
contribution of lens galaxy subtraction to our systematic 
error budget. 

A side effect of the domination of the lens galaxy sub- 
traction problem is that the reduced chi-squared values 
from the lens modeling are often not close to unity. How- 
ever, this need not affect our conclusions about the PSF 
model for fixed subtraction scheme: a good PSF is re- 
quired at all 4 image positions, but the galaxy subtrac- 
tion residuals vary between these points. 

Figure 3 shows the 1-d marginalized probability
distributions for a selection of lens and source model
parameters, given the NIRC2+LGSAO infra-red imaging
dataset, in order to illustrate the effect of the different
PSF models and the different lens galaxy subtraction
schemes on the inferences. Similar results were obtained

We put the reduced chi-squared values in context by computing
the number of sigma, $N_\sigma$, by which the unreduced
chi-squared deviates from the mean of its distribution. We do this
using Fisher's approximation, that $\sqrt{2\chi^2}$ is Gaussian-distributed
with mean $\sqrt{2k - 1}$ and unit variance, where k is the number of
degrees of freedom, assumed to be large.
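
Fisher's approximation as stated here translates directly into a short function. The sketch below is illustrative only: the number of degrees of freedom (`dof = 5000`) is a hypothetical value chosen to be of the order of the pixel count of a small image cutout, since the actual values are not quoted in this section.

```python
import math


def n_sigma(chi_squared, dof):
    """Number of sigma by which chi-squared deviates from its expectation.

    Uses Fisher's approximation: sqrt(2 chi^2) is Gaussian with mean
    sqrt(2k - 1) and unit variance for large k.
    """
    if dof < 1:
        raise ValueError("dof must be at least 1")
    return math.sqrt(2.0 * chi_squared) - math.sqrt(2.0 * dof - 1.0)


def n_sigma_from_reduced(reduced_chi_squared, dof):
    """Same statistic, starting from the reduced chi-squared."""
    return n_sigma(reduced_chi_squared * dof, dof)


# Hypothetical number of degrees of freedom (not taken from the text).
dof = 5000
example = n_sigma_from_reduced(1.219, dof)
print("N_sigma for reduced chi-squared 1.219:", round(example, 2))
```

For a reduced chi-squared of exactly one the statistic is close to zero, and it grows roughly as the square root of the number of degrees of freedom for a fixed reduced chi-squared, which is why modest departures from unity can correspond to many sigma.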
Fig. 2. - Data (left panels), predicted data (middle panels) and residual (right panels) for the best-fit lens models. Top row:
NIRC2+LGSAO K'-band data; second row: HST NICMOS data; third row: HST ACS F814W data; bottom row: HST ACS F555W data.
The critical curve and astroid caustic of the lens model are overlaid in each case. The optimal PSF model was used for each dataset, and
the lens galaxy subtraction scheme was sub. The pixel scale is 0.0397 arcsec: all these cutout images are 2.81 arcsec on a side.
TABLE 2

Model selection statistics for each analysis.

Dataset            Subtraction  PSF    Reduced      N_sigma  Relative
                   scheme       model  chi-squared           evidence

NIRC2+LGSAO K'     sub          PSF0   1.219        10.6     1.0
                   sub          PSF1   1.220        10.7     0.03
                   sub          PSF2   1.222        10.8     0.001

HST NICMOS F160W   sub          PSF1   0.991        -0.46    1.0
                   sub          PSF2   0.990        -0.52    193
                   sub          PSF3   0.991        -0.43    0.2

HST ACS F814W      sub          PSF1   6.276        153.3    1.0
                   sub          PSF2   6.259        153.0    e^[illegible]
                   sub          PSF3   6.277        153.4    0.4

HST ACS F555W      sub          PSF1   1.083        4.16     1.0
                   sub          PSF2   1.084        4.18     0.07
                   sub          PSF3   1.084        4.20     0.01

NIRC2+LGSAO K'     sub          PSF0   1.219        10.6     1.0
                   msub         PSF0   2.704        65.7     e^-[illegible]

HST NICMOS F160W   sub          PSF2   0.989        -0.52    1.0
                   msub         PSF2   2.596        62.3     e^-[illegible]

HST ACS F814W      sub          PSF2   6.259        153      1.0
                   msub         PSF2   297          2650     e^-[illegible]

HST ACS F555W      sub          PSF1   1.083        4.2      1.0
                   msub         PSF1   1.666        29.6     e^-[illegible]



from the other filters' data, and are not shown here for 
the sake of clarity. 

This figure shows that the choice of PSF model is 
not critical in determining the available accuracy on the 
model parameters: in all cases the parameter estimates 
agree within the statistical precision. The choice of lens 
galaxy subtraction scheme has a more significant effect
on the model parameters; in particular, the two schemes
investigated give rise to a difference of ~0.2 magnitudes
in source brightness.

To marginalize over the range of PSF models one would
use the relative evidence values to weight the different
posterior PDFs (as shown in equation (5)); however, since
the evidence ratios in the top half of Table 2 are typically
significantly different from unity we approximate
this procedure by simply selecting the PSF model with
the highest evidence. For the rest of this paper, we use
the optimal PSF models for each dataset (from the maximum
evidence values given in Table 2), and assert the
SExtractor detected object mask subtraction scheme
sub: the alternative msub distributions given in Figure 4
(and those for the other model parameters) provide estimates
of the systematic errors we expect for each parameter.
We now compare parameter estimates in the four
different filters to compute the properties of the lens and
the source.

6.1. Lens properties 

Figure 4 shows the inferred SIE velocity dispersion and
mass distribution ellipticity for the SDSSJ0737+3216
lens. These parameters (along with the mass orientation,
not shown) agree reasonably well across the filters,
as they should given the achromaticity of the lensing effect.
The largest discrepancies come from the deeper
HST ACS F814W image. The likelihood function for
this data is steeper, making it both harder for an MCMC
sampler to explore the parameter space, and for a simple
model to provide a good fit. In this case the inferred
parameter uncertainties should be accepted with caution.
Still, the inferred SIE velocity dispersion is in good agreement
with that found by Koopmans et al. (2006) from
their shallower HST/ACS snapshot data.

We note that an offset of 0.5 km/s in the velocity dis- 
persion is equivalent to one of 3.4 milliarcsec in the Ein- 
stein radius, a fractional error of 0.3%. We assume that 
the reported image platescales are known to better than 
this, but this may not be the case. The truncation of the 
posterior pdf for lens ellipticity is a direct result of our 
assumption of a prior on this parameter that was uniform 
between 0.0 and 0.3. The lack of strong degeneracy be- 
tween ellipticity and any other parameter indicates that 
this truncation is not a problem in this case - but it 
serves as a warning for future analyses. 
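
The link between the two offsets quoted above follows from the standard isothermal scaling of the Einstein radius with the square of the velocity dispersion, so that to first order the fractional error in the Einstein radius is twice that in the velocity dispersion. The sketch below applies this scaling to the rounded numbers in the text; the Einstein radius and velocity dispersion it returns are only what those rounded numbers imply, not values quoted by the paper.

```python
def fractional_einstein_radius_error(sigma_kms, delta_sigma_kms):
    """First-order fractional change in theta_E for theta_E proportional to sigma^2."""
    return 2.0 * delta_sigma_kms / sigma_kms


def implied_scales(delta_sigma_kms, delta_theta_mas, fractional_error):
    """Einstein radius (mas) and velocity dispersion (km/s) implied by the offsets."""
    theta_e_mas = delta_theta_mas / fractional_error
    sigma_kms = 2.0 * delta_sigma_kms / fractional_error
    return theta_e_mas, sigma_kms


# Offsets quoted in the text: 0.5 km/s, 3.4 milliarcsec, 0.3 per cent.
theta_e_mas, sigma_kms = implied_scales(0.5, 3.4, 0.003)
print("implied Einstein radius (arcsec):", round(theta_e_mas / 1000.0, 2))
print("implied velocity dispersion (km/s):", round(sigma_kms))
```

Because the 0.3% figure is rounded, these implied scales are accurate to perhaps ten per cent at best; the point is only that a platescale error at the few-tenths-of-a-per-cent level maps onto a sub-km/s shift in the inferred velocity dispersion.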

7. SOURCE PROPERTIES 

Having calibrated the optics of our cosmic telescope we
turn our attention to the target of the observation: the
lensed source at redshift $z_s$. Figure 5 shows the multi-color
reconstruction of this object, which shows the presence
of a red, compact core centered on a more extended
blue light distribution. The ellipticity and orientation
of the source are a good match with those found from
shallower data by Koopmans et al. (2006). We note that
the alignment of the different filters' reconstructions is
very good, and that qualitatively we seem to be recovering
the large-scale stellar component rather than being
dominated by any smaller-scale features.

7.1. Source photometry and morphology results 

The top left-hand panel of Figure 6 shows the 2-d
marginalized probability distributions for two source
model morphology parameters, the effective radius $r_e$
and Sersic index $n$, given each of the datasets. We again
note that the precision available for each parameter is
much higher for the deep HST ACS F814W image, and
very similar across the other three datasets. Likewise,
the lower panels in this figure show the inferred source
orientation and ellipticity, which are reasonably constant
through the bandpasses.

We infer a small, compact source galaxy across the
whole wavelength range. The differences in morphology
between the filters are not large, but there is a suggestion
that in the redder bands the profile is slightly
more compact, approaching the Gaussian distribution
(n ~ 0.7). However, the degeneracy between $r_e$ and $n$ can
be clearly seen, warning us not to over-interpret the
inferences: a robust conclusion is that the inferred Sersic
index is low in all filters. Likewise, the PDFs plotted
with the two different linestyles show the effects of the
different lens galaxy subtraction schemes on the inferred
source morphology. In particular, the deep HST ACS
F814W data can be seen, as expected, to be generally
more systematics-dominated than the other filters', with
significant (if small) differences in inferred effective
radius and magnitude between different analyses. It is in
this filter that the sensitivity to the different model
assumptions is highest, and the limitations of our
simply-parameterized model are made most clear.

The photometry is also (unsurprisingly) affected by the
lens galaxy subtraction: the lens subtraction systematic
error can be seen in the top right-hand panel of
Figure 6. In the next section we use the photometry from
subtraction scheme sub, and return to the systematic
error budget in section 8.

Fig. 3. - Marginalized posterior probability distributions for four of the model parameters, given the NIRC2+LGSAO data only. Top
row: lens SIE velocity dispersion. Second row: lens mass ellipticity. Third row: source AB magnitude. Bottom row: source effective radius.
Left panels: comparing different PSF models. Right panels: comparing different lens galaxy subtraction schemes (sub dark, msub light).

Fig. 4. - Marginalized joint posterior probability distributions,
given each dataset, for the lens SIE velocity dispersion and mass
distribution ellipticity. The contours enclose 68% and 95% of the
integrated probability. Solid curves are for the preferred galaxy
subtraction scheme sub, while the dashed curves are for the
alternative scheme msub.

7.2. Spectral energy distribution and stellar mass of the 
source galaxy 

Armed with photometry from HST ACS (F555W and
F814W), HST NICMOS (F160W) and NIRC2+LGSAO
(K'), we now reconstruct the spectral energy distribution
(SED) of the source. To account for uncertainty in the
zero points and filter transmission curves, we assert
statistical errors of 0.1 and 0.05 respectively and add these
in quadrature to the statistical errors from the MCMC
inferences.
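
The quadrature sum described here is simple enough to write out. In the sketch below the two calibration terms are the values asserted in the text, taken to be in magnitudes (the unit is not stated explicitly), and the MCMC error of 0.03 mag is a hypothetical input used only to show the calculation.

```python
import math

ZERO_POINT_ERROR = 0.1     # asserted in the text (assumed to be in magnitudes)
FILTER_CURVE_ERROR = 0.05  # asserted in the text (assumed to be in magnitudes)


def total_photometric_error(mcmc_error):
    """Add the calibration terms in quadrature to the MCMC statistical error."""
    return math.sqrt(mcmc_error ** 2
                     + ZERO_POINT_ERROR ** 2
                     + FILTER_CURVE_ERROR ** 2)


# Hypothetical MCMC statistical error on a source magnitude (illustrative only).
example_total = total_photometric_error(0.03)
print("total error (mag):", round(example_total, 3))
```

Under these assumptions the calibration terms set a floor of about 0.11 mag on each flux point, so the SED fit is limited by calibration rather than by the lens modeling whenever the MCMC error is a few hundredths of a magnitude.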

Given the known redshift, we estimate the stellar
mass by fitting the observed colors to a variety of
SED templates (Bundy et al. 2006). The best fit model
is obtained for an exponential star-formation rate with
short characteristic timescale $\tau$ ~ 0.04 Gyr and young
age ($\lesssim$ 0.7 Gyr), and corresponds to a stellar mass
of $\log_{10} M_*/M_\odot = 9.3 \pm 0.2_{\rm stat} \pm 0.17_{\rm sys}$, assuming
a Chabrier IMF, where the error bar is obtained by
marginalizing the posterior over the stellar populations'
parameters. In other words, the system appears to have
undergone a very recent burst of star formation, consistent
with its selection via emission lines. This inferred
star formation history is consistent with the SED-fitting
performed by Guzman et al. (2003) on a sample of luminous
compact blue galaxies taken from the sample of
Phillips et al. (1997).

Figure 7 shows the fluxes (and uncertainties) used in
the fit plotted as a function of wavelength, with the
best-fitting galaxy template - normalized to the observed K'
luminosity - overlaid. For reference, the absolute AB
magnitude in the F555W-band is -19.66 ± 0.05. We
note that choice of IMF is the single largest source of
systematic uncertainty (0.2-0.3 dex, Bundy et al. 2006)
in the absolute stellar mass. However, when comparing
stellar masses with other surveys we must compute the
same model-dependent masses. Both Shen et al. (2003)
and Barden et al. (2005) assume a Kroupa IMF, which
results in stellar masses different from those assuming a
Chabrier IMF by just 0.05 dex.

Likewise, we note that the stellar masses of less
well-resolved galaxies in the literature typically also come
from a global modeling of the object photometry (rather
than a joint morphological and photometric analysis),
justifying our approach to modeling the SED here. The
Sersic indices measured in section 7.1 are also sufficiently
similar to justify the assumption of a single stellar
population when estimating the stellar mass. We do not,
in any case, expect the systematic error in the absolute
stellar mass introduced to be greater than that from the
IMF uncertainty. Furthermore, the consistency between
the filters (all the way out to the K'-band) suggests
that we are not dominated by small-scale star-forming
regions in either the mass or size measurements.

8. SYSTEMATIC ERRORS 

Photometry with AO imaging has the reputation of
being at best difficult and at worst inaccurate. In this
work we have looked carefully at several systematic errors
associated with photometric and morphological analysis of
small extended objects viewed through galaxy-scale
gravitational lenses, and now discuss these errors in a little
more detail.

8.1. Model-dependent LGSAO photometry

The basic problem of measuring the total flux of an
object, and the radius within which half of this total flux
is contained, is partially solved by the assumption of a
sensible model intensity distribution, allowing the light
profile to be extrapolated beyond the data region. This
solution is of course only as good as the model assumption,
but at least leads to a set of well-defined quantities
(e.g. "Sersic magnitudes"). The underlying assumption
is that high resolution imaging data provides enough
constraints on the inner part of the profile that the
extrapolated quantities can be accurately inferred.

One could argue that imposing a model in this way 
"biases" the results - distant galaxies are not necessarily 
expected to have pure Sersic profiles. The system stud- 
ied here at least appears to be simple, in that a single 
image component provides a reasonable fit in the infra- 
red, but there are suggestions in the bluer filters that 
the galaxy has a more complex morphology. This is 
perhaps to be expected given that this system was
selected for its emission line spectrum, indicating ongoing
star-formation and consequent likely clumpy morphology.
However, if we are to quantify galaxies like the
source behind SDSSJ0737+3216 in a way that permits
comparison with other datasets and/or with a physical
theory then the Sersic profile appears to be the most
natural choice, given its widespread use. The galaxy itself
may not be well-fit by a Sersic profile - but that does not 
mean that knowing its Sersic parameters is not useful. 
The fitting of a lensed Sersic profile is an appropriate 
way of measuring the average properties of the source 
light distribution, even in the blue filters. We note that 
the residual features in the bluer images are smaller still 
than the inferred Sersic component, suggesting that we 
are measuring the principal stellar structure, and not a 
smaller, brighter, peripheral star-forming region, even at 
the shorter wavelengths. 







Fig. 5. - Multi-filter reconstruction of the source behind SDSSJ0737+3216. From left to right we plot: data, predicted data, residual
and reconstructed source plane images for the best-fit lens models, assuming optimal PSF model and lens galaxy subtraction scheme sub.
Note the resolution of a red, compact core centered on a more extended blue light distribution. The red, green and blue image channels are
given by the K'-band, F814W-band and F555W-band images respectively, and the relative scales were chosen (manually) to equilibrate
the noise levels across the channels.

Fig. 6. - Marginalized posterior probability distributions for pairs of source model parameters, given each dataset. Top left: effective
radius $r_e$ and Sersic index $n$; Top right: effective radius $r_e$ and AB magnitude; Bottom left: effective radius $r_e$ and orientation angle $\phi$;
Bottom right: effective radius $r_e$ and ellipticity $e$. The contours enclose 68% and 95% of the integrated probability. Solid curves are for
the preferred galaxy subtraction scheme sub, while the dashed curves are for the alternative scheme msub.



8.2. PSF model selection and truncation 

Assuming a model galaxy profile, and having 4 predictable
copies of the same image, means that the PSF
structure can be inferred concurrently with the source
itself. Indeed, we have shown that PSF selection via
the Bayesian evidence is possible: there is information in
all the imaging data analyzed on the most appropriate
PSF. We noted that, since the number of model parameters
is unchanged between the different PSF models, the
evidence is being dominated by the goodness of fit. However,
the PSF suitability is related to the choice of source
galaxy model and its parameter prior PDFs. This leads
to the evidence being a sharper tool for PSF selection
than the reduced chi-squared values, as can be seen in
Table 2. In the case studied here, the PSF selection is
interesting but not critical, as we clearly see that there
are larger systematic effects at play.

Fig. 7. - Reconstructed source SED. The solid curve is the
best-fitting spectrum normalized to the luminosity inferred in the K'
filter; the error bars on the flux points show the statistical errors
assumed in the fit. In the background we show the filter transmission
curves (from blue to infra-red: F555W, F814W, F160W, K').

Our treatment of the PSF with a small cutout star
image is cause for more concern. Our (internally-normalized)
PSF postage stamps, at 16 pixels width,
only span 1.5 times the seeing disk FWHM (≈ 0.4 arcsec).
To quantify the effect of this on the inferred model
parameters, we simulated NIRC2+LGSAO observations
of a gravitational lens having the same properties as
SDSSJ0737+3216 (i.e. the parameter values found in
sections 6.1 and 7.1). For the PSF we used a concentric sum
of two Gaussians (representing the seeing disk wings and
the Airy pattern core, with relative weights given by the
Strehl ratio), following Law et al. (2006). The simulated
data were generated with a large 72-pixel PSF cutout,
while the MCMC sampler was provided with a posterior
PDF assuming a small, renormalized 16-pixel PSF
cutout. We investigated input Strehl ratios of 0.2, 0.3, and
0.4, assuming FWHM values of 0.10 and 0.40 arcsec
for the K'-band Keck diffraction pattern core and seeing
disk respectively. Offered a choice of model
double Gaussian PSFs with the true seeing and core size
and Strehl ratios of 0.2, 0.3, and 0.4, the evidence
selected the correct (input) PSF each time, by about the
same margin as seen with the real data. Using the maximal
evidence PSF, we then found that the magnitude
of the source was underestimated by 0.03 mags, comparable
to the statistical uncertainty. This is significantly
smaller than the other estimated errors (that were used
in the stellar mass calculation), but comparable to the
error introduced by the lens galaxy subtraction. The
effective radius was found to be over-estimated by 0.005
arcsec, a small but statistically significant increase; the
Sersic index was also overestimated by 0.1 or so. These
shifts, while contributing to the overall systematic error
budget, do not affect our conclusion about the unusual
size of this source, which we discuss below.
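
To get a feel for how much of a double-Gaussian PSF a 16-pixel stamp actually contains, one can integrate the two components over the stamp analytically. The sketch below is a rough illustration and does not reproduce the simulation described above: it assumes that the Strehl ratio is the fraction of the total flux in the core component (one possible reading of "relative weights given by the Strehl ratio"), ignores pixelization, and takes the 0.0397 arcsec pixel scale from the Fig. 2 caption.

```python
import math

FWHM_TO_SIGMA = 1.0 / (2.0 * math.sqrt(2.0 * math.log(2.0)))
PIXEL_SCALE = 0.0397  # arcsec per pixel, from the Fig. 2 caption


def gaussian_flux_in_square(half_width, fwhm):
    """Fraction of a circular 2-d Gaussian's flux inside a centred square."""
    sigma = fwhm * FWHM_TO_SIGMA
    one_d = math.erf(half_width / (math.sqrt(2.0) * sigma))
    return one_d * one_d  # the 2-d integral separates into two 1-d integrals


def enclosed_fraction(strehl, n_pixels, core_fwhm=0.10, seeing_fwhm=0.40):
    """Flux fraction of a core-plus-halo PSF inside an n_pixels-wide stamp.

    Assumption: the core carries a fraction `strehl` of the total flux.
    """
    half_width = 0.5 * n_pixels * PIXEL_SCALE
    core = gaussian_flux_in_square(half_width, core_fwhm)
    halo = gaussian_flux_in_square(half_width, seeing_fwhm)
    return strehl * core + (1.0 - strehl) * halo


small = enclosed_fraction(0.3, 16)
large = enclosed_fraction(0.3, 72)
print("flux fraction inside 16-pixel stamp:", round(small, 3))
print("flux fraction inside 72-pixel stamp:", round(large, 3))
```

Under these assumptions the small stamp misses several per cent of the halo flux while the large one is essentially complete. Renormalizing the small stamp redistributes that missing flux, which is qualitatively the kind of effect the simulation was designed to measure.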

We conclude that LGSAO photometry of faint extra- 
galactic extended sources at the 0.05 magnitude accuracy 
level (not including zero point and filter curve calibra- 
tion) is perfectly possible using techniques such as those 
used in this work. However, we caution that the con- 
ditions of observations were exceptionally good, both in 
terms of seeing and stability of the PSF. This is supported
by the fact that the specially-observed star PSF0
gave the best results, and by the consistency between the
results obtained with this star and with the serendipitous
stars observed in the object field itself. This consistency
is not guaranteed in general, since the PSF can be expected
to change significantly on timescales like the time
interval between our observations of the lens field and of
PSF0 (J. Graham, priv. comm.), and since spatial variations
of the PSF can be significant (Steinbring et al., in
prep.). However, it bodes well for the future that our
results would be essentially unchanged had we only used
the stars in the field of the target.

8.3. Overall systematic error budget 

In sections 6.1 and 7.1 we identified the lens galaxy
subtraction as a serious issue leading to the dominant
systematic error when inferring the model parameters
from well-calibrated data and assuming an isothermal
density profile lens. A better approach would be to fit
the lens and source simultaneously, making use of the
quadruple-imaging to constrain the two intensity distributions
with minimal degeneracy. From the Moffat profile
fit residuals (scheme msub, Table 2) we see that such
a procedure would require a flexible model (such as the
B-splines used here) for the lens galaxy light, in order
to get a good fit. This is not computationally feasible
within the current framework, but should be possible in
the semi-linear formalisms of Warren & Dye (2003) and
others.

Comparison of the parameter estimates between subtraction
schemes does give a quantitative feel for the systematic
errors introduced by the lens galaxy subtraction.
These are compared with the other errors identified in
this work in Table 3. We see that even the largest image
analysis systematic error, that due to the lens galaxy
subtraction, is still smaller than that introduced by the
assumption of an isothermal density profile lens mass
distribution. Conservatively combining all the systematic
errors by simple addition, the resultant systematic errors
on the size and stellar mass are approximately 0.1
kpc and $0.8 \times 10^9 M_\odot$; these may be compared with the
statistical uncertainties shown in Figure 7.
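
The "simple addition" of the entries in Table 3 can be checked directly. The sketch below sums each column and converts the stellar mass error from dex to linear units using a first-order approximation. The entries are transcribed from Table 3 and the central mass is the quoted value of 10^9.3 solar masses; the helper names are illustrative.

```python
import math

# Rows of Table 3: (delta r_e [kpc], delta m_AB, delta log10 M* [dex]); None = N/A.
TABLE3 = {
    "Incorrect stellar PSF model": (0.01, 0.01, 0.005),
    "Truncated PSF image": (0.005, 0.03, 0.015),
    "Lens galaxy subtraction": (0.01, 0.10, 0.04),
    "Stellar mass IMF choice": (None, None, 0.05),
    "Lens model density slope": (0.07, 0.26, 0.11),
}


def linear_sum(column, skip=()):
    """Add one column of Table 3 linearly, ignoring N/A entries and skipped rows."""
    return sum(row[column] for name, row in TABLE3.items()
               if name not in skip and row[column] is not None)


def dex_to_linear(central_value, dex_error):
    """First-order linear error on a quantity whose error is quoted in dex."""
    return central_value * math.log(10.0) * dex_error


size_total = linear_sum(0)
mag_total = linear_sum(1)
mass_total_all = linear_sum(2)
mass_total_no_imf = linear_sum(2, skip=("Stellar mass IMF choice",))
mass_sys_linear = dex_to_linear(10 ** 9.3, 0.17)

print("size total (kpc):", round(size_total, 3))
print("magnitude total:", round(mag_total, 2))
print("mass total, all rows (dex):", round(mass_total_all, 3))
print("mass total, without IMF row (dex):", round(mass_total_no_imf, 3))
print("mass systematic error (Msun):", f"{mass_sys_linear:.2e}")
```

The size and magnitude columns sum to the tabulated totals. For the stellar mass, the tabulated 0.17 dex matches the sum of the four image-analysis and lens-model rows, whereas including the IMF row would give 0.22 dex. The text does not say why, so this is noted here only as an arithmetic observation.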

For the Sersic index we read off the systematic errors
from Figure 6 as 0.2 for the optical filters, and 0.1 for the
infra-red filters, and assume that this is unaffected by the
lens density profile (which simply changes the magnification
of the source).

9. DISCUSSION: THE SIZE-MASS RELATION AT Z = 0.6 

From the analysis presented above we obtain the final
result that $\log_{10} r_e/{\rm kpc} = -0.23 \pm 0.005_{\rm stat} \pm 0.07_{\rm sys}$,
$\log_{10} M_*/M_\odot = 9.3 \pm 0.2_{\rm stat} \pm 0.17_{\rm sys}$ and
$n = 1.0 \pm 0.1_{\rm stat} \pm 0.2_{\rm sys}$ for the size, stellar mass,
and Sersic index of the SDSSJ0737+3216 source galaxy,
where these numbers are global estimates based on all
the filters' data: the plots in Figure 6 and the systematic
error analysis of the previous section indicate
that the small differences between the different bands
are not significant. We overlay these values on
the local relation for "disk" galaxies (i.e. those with
Sersic index n < 2.5) derived from SDSS by Shen et al.
(2003), and the corresponding z ≈ 0.6 relation derived
from GEMS by Barden et al. (2005) (and interpreted by
Somerville et al. 2006) in Figure 8. For its stellar mass
(which is a factor of 5 smaller than the GEMS completeness
limit), the source behind SDSSJ0737+3216 appears
to be about a factor of 3 smaller than the average local
galaxy (Shen et al. 2003), putting it approximately 3-sigma
below the local size-mass relation for galaxies with low
Sersic index.

We note that the SDSS size estimates were made in the
r-band, such that the effective rest-frame wavelength is about
5700 angstroms. At redshift 0.6, this falls in the i-band, as used
(approximately) in GEMS. Our corresponding F814W-band
measurement is the one most affected by systematic errors - however,
the global size estimate we use can be seen (Figure 6) to be
representative of the size in this filter.

Fig. 8. - Size-mass relations for galaxies selected by their measured Sersic index. We plot the (relation between the) stellar mass
and effective radius for galaxies in the local Universe (left, SDSS, Shen et al. 2003) and at z ≈ 0.6 (right, GEMS, Barden et al. 2005;
McIntosh et al. 2005; Somerville et al. 2006). "Disk-like" (n < 2.5) galaxies are shown in blue, and "bulge-like" (n > 2.5) galaxies are
shown in red. In the left-hand plot we show the morphologically-selected dwarf early-type galaxies in the Virgo cluster, again divided by
Sersic index. The black pentagonal point shows the source behind SDSSJ0737+3216. The mass plotted here assumes a Kroupa IMF.

TABLE 3

Summary of systematic errors identified in the text, and
their effects on the principal source parameters.

Description                   δr_e    δm_AB   δlog10 M*/Msun
                              (kpc)           (dex)

Incorrect stellar PSF model   0.01    0.01    0.005
Truncated PSF image           0.005   0.03    0.015
Lens galaxy subtraction       0.01    0.10    0.04
Stellar mass IMF choice       N/A     N/A     0.05
Lens model density slope      0.07    0.26    0.11
Total (approximate)           0.10    0.40    0.17

The distant GEMS data do not extend to small enough
masses to allow for a direct comparison with our measurement.
However, bringing our point into agreement
with the typical z = 0.6 galaxy from GEMS
would require a somewhat marked flattening of the
size-magnitude relation at masses lower than $10^{10} M_\odot$
since redshift 0.6, which appears unlikely given the very
modest evolution observed at masses above $10^{10} M_\odot$:
Barden et al. (2005) find a constant size-mass relation;
Trujillo et al. (2006) measure $r_e \propto (1+z)^{-0.4 \pm 0.06}$ for
disk-like galaxies, predicting that the mean object at
z ≈ 0.58 is only about 0.83 times the size of the mean local
disk-like galaxy. Even comparing with the incomplete
GEMS data at ~$10^9 M_\odot$ (Barden et al. 2005, Figure 10)
we see that our object is unusually small. Thus we conclude
that the source galaxy behind SDSSJ0737+3216 is,
relative to existing surveys, somewhat extreme in terms
of mass and size, if it is indeed a disk galaxy.
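
The factor of 0.83 quoted above follows from a power-law size evolution in (1+z) with exponent -0.4, evaluated at the source redshift. The short sketch below reproduces that number and shows the range implied by the ±0.06 uncertainty on the exponent; the function name is illustrative.

```python
def size_evolution_factor(z, exponent=-0.4):
    """Mean size at redshift z relative to the local mean, for r_e ~ (1+z)^exponent."""
    return (1.0 + z) ** exponent


central = size_evolution_factor(0.58)
steepest = size_evolution_factor(0.58, -0.46)
shallowest = size_evolution_factor(0.58, -0.34)

print("central factor:", round(central, 3))
print("range:", round(steepest, 3), "to", round(shallowest, 3))
```

Even at the steep end of the quoted exponent range, the expected shrinkage between the local Universe and z ≈ 0.58 is under 20 per cent, far short of the factor of about 3 by which the source falls below the local relation.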

Such compact galaxies have, however, been well-studied.
Koo et al. (1994) identified a sample of compact,
narrow emission-line galaxies in the Hubble Deep
field at redshift ≈ 0.2, having luminosities ($M_B$ ~ -19)
and sizes ($r_e$ ≈ 1 kpc). These objects comprise a small
fraction of the ubiquitous faint blue galaxies reviewed
by Ellis (1997). Extending the sample to intermediate
redshift and focusing on the higher luminosity members
($M_B$ ≈ -21), Koo et al. (1995) found using high resolution
spectroscopy that these objects appear more similar
to local HII galaxies (dwarf galaxies showing violent star
formation activity), and suggested that these systems
will evolve into today's dwarf spheroids. This conclusion
was also reached by Phillips et al. (1997) in an extension
of that work, although they note that the number densities
are such that not all compact galaxies at intermediate
redshift can be progenitors of spheroids. However,
Hammer et al. (2001) argue that the observed narrow
emission line widths may not represent the depth of the
whole galaxy potential, and instead argue on the basis of
the stellar masses they infer from their spectra and photometry
that the more luminous, compact galaxies are
more likely the progenitors of the bulges of present-day
massive spiral galaxies. At the stellar mass scale inferred
in this work ($10^{9.3}$), the source behind SDSSJ0737+3216
appears closer to the low-luminosity end of the samples
of Koo et al. (1995) and Phillips et al. (1997), but is almost
a factor of two smaller than the limiting size of
their sample (0.16 arcsec) and is fully resolved. Indeed, our
physical resolution is comparable to that reachable for
galaxies in the Virgo cluster with ground-based seeing;
we find that the SDSSJ0737+3216 source is comparable
to the smallest dwarf ellipticals seen in the Virgo
cluster (Capaccioli et al. 1992; Ferrarese et al. 2006;
Figure 8), and is typical of the objects in the smallest size bin
of the Sersic-selected "elliptical" galaxies in the GEMS
survey (McIntosh et al. 2005).

Can we interpret our super-resolved source morphology
as being that of a forming spheroid? The low Sersic
index measured would suggest not, placing it firmly in
the disk-like samples of the literature. However, at
low masses there is evidence that elliptical galaxies can
have Sersic indices of one and below (e.g. Trujillo et al.
2004). The consistency in morphology between the
observation filters is indicative of a regular spheroid
(although the residual structure in the bluer filters would
argue against a highly-evolved, smooth stellar distribution).
Perhaps the strongest indicator is the position of
the source in the size-mass plane. In Figure 8 we show
the local, and z = 0.6, size-mass relations for "elliptical"
(n > 2.5) galaxies from SDSS and GEMS (Shen et al.
2003; McIntosh et al. 2005), and can see that the source
behind SDSSJ0737+3216 sits rather more comfortably
with these relations, albeit at significantly (a factor of 2
- 4) lower mass.

Our results demonstrate that, using a gravitational 
telescope to super-resolve the source, it is possible to 
study in considerable detail atypical sources that may 
well be missed or excluded by non-lensed surveys. In 
fact, size-mass studies at redshift 0.5 and above have 
necessarily focused on the high luminosity end, and have 
inevitably included a size cut to remove stars from the 
catalogs, before a further completeness cut that discards 
the least massive galaxies. It is not clear just how many 
small galaxies are being overlooked in this way. The 
higher resolution afforded by our gravitational telescope 
allows us to study the structure and surface brightness 
profiles of the compact blue galaxies in much greater de- 
tail and with higher precision, and to extend the inves- 
tigation to smaller sizes still. For comparison, the total 
magnification provided by SDSSJ0737+3216 is $\mu \approx 13$,
indicating an angular resolution in the source plane of 
approximately 0.01 arcsec. The 10% accuracy we obtain 
on our size measurement indicates that we are still some 
way from the limit imposed by the resolution of our op- 
tics. Indeed we note that this accuracy can be improved 
by a further factor of 2 by simply using a more flexible 
lens model. 

Having demonstrated the power of this method, a 
larger sample of objects is needed in order to infer statis- 
tically meaningful conclusions about the low mass/size 
tail of the mass-size relation. As clearly discussed by
Barden et al. (2005), to achieve this goal it is crucial to
understand the selection function of the objects being
used in the size-mass relation study. Due to a form of the
so-called magnification bias, gravitational lens surveys
such as SLACS tend to favor compact sources. SLACS
lenses were selected from (spectroscopic) observations
where the system is essentially unresolved, meaning that
lens systems with high total magnification are preferentially
detected. This bias is strongest when the source
is point-like, i.e. much smaller than the size of the fiber
and the Einstein radius ($\approx 1$ arcsec). Thus, it is not so
surprising that the first source to be studied is compact.
This realization implies that when performing statistical
analyses of the size-mass relation of lensed galaxies it
will be necessary to use Monte Carlo simulations to understand
and quantify the selection function of multiple
image systems. Applying our methodology to other lens
systems will, once the selection effects are quantified,
extend this study to join the existing statistical analyses
of higher-mass disks, to probe the small size (i.e. low
angular momentum) regime.
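To illustrate the kind of Monte Carlo calculation meant here, the following is a minimal toy sketch of magnification bias for point-like sources behind a singular isothermal sphere. It is not the SLACS selection function: the intrinsic flux distribution (`f_min`, `slope`), the flux limit, the maximum source offset `y_max` and the magnification cut `mu_cut` are all hypothetical values chosen only for illustration, and finite source size is ignored.

```python
import random

def sis_total_magnification(y):
    """Total point-source magnification of a singular isothermal sphere.

    y is the source offset in units of the Einstein radius (y > 0):
    two images for y < 1 (total 2/y), one image otherwise (1 + 1/y).
    """
    return 2.0 / y if y < 1.0 else 1.0 + 1.0 / y

def magnification_bias_demo(n_sources=100_000, y_max=3.0, f_min=0.05,
                            slope=1.5, flux_limit=1.0, mu_cut=4.0, seed=1):
    """Return the fraction of systems with mu > mu_cut in the parent
    population and in the flux-limited ("detected") subsample."""
    rng = random.Random(seed)
    n_high_all = n_det = n_high_det = 0
    for _ in range(n_sources):
        # Sources uniform in area behind the lens: p(y) ~ y on (0, y_max].
        y = y_max * (1.0 - rng.random()) ** 0.5
        mu = sis_total_magnification(y)
        # Hypothetical power-law (Pareto) intrinsic flux distribution.
        flux = f_min * (1.0 - rng.random()) ** (-1.0 / slope)
        high = mu > mu_cut
        n_high_all += high
        if mu * flux >= flux_limit:      # unresolved, flux-limited selection
            n_det += 1
            n_high_det += high
    return n_high_all / n_sources, n_high_det / n_det

parent_frac, detected_frac = magnification_bias_demo()
print(f"mu > 4 fraction: parent {parent_frac:.3f}, detected {detected_frac:.3f}")
```

In this toy setup the detected subsample contains a much larger fraction of high-magnification systems than the parent population, which is the qualitative effect described above. A realistic treatment would replace the point-source magnification with that of extended sources of varying size.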

The SLACS lenses are well-suited to this task: the
efficiency of the survey is such that some 100 high-magnification
systems like SDSSJ0737+3216 are expected
to be found by the end of the program. (The
number of systems currently confirmed using high resolution
imaging is already close to this figure.) Extending
the study to sources at even higher redshifts requires
more lenses to be discovered at greater distances: the
SL2S survey (Cabanac et al. 2007) is expected to discover
$\sim 100$ suitable systems, with sources at redshifts
of 1.0 and higher. However, the detection of these systems
(via a ground-based imaging survey in the r'- and
i'-bands) will lead to different selection effects than those
present in the SLACS survey, and will again require
Monte Carlo simulations in order to understand them.

The different identification schemes of the SLACS and
SL2S surveys also introduce a different selection effect in
terms of stellar population, which will have to be modeled
and taken into account when interpreting the results.
SLACS sources are emission line selected and will
therefore be representative of actively star-forming galaxies,
directly comparable with galaxies selected in narrow
band surveys. SL2S sources are continuum-selected and
contain a mix of actively star-forming, post-starburst and
quiescent stellar populations, directly comparable with
the galaxy population studied by wide field HST surveys
in similar broad bandpasses.

An additional implication of the lensing selection effect
is that the magnification bias in some sense increases the 
power of a galaxy survey, by picking out the smallest 
objects and then making them measurable. With current 
technology, gravitational telescopes are the only way of 
accurately measuring such tiny objects. 

10. CONCLUSIONS 

We find that high quality images from
NIRC2+LGSAO are capable of providing very similar
precision on simple lens and source model parameters
to typical datasets from HST ACS and HST NICMOS.
The data themselves contain information about the
most appropriate PSF model to use, to the extent
that a set of nearby unsaturated stars can be fruitfully
compared using suitable statistics that are sensitive
to the goodness-of-fit. We estimate that even for the
LGSAO imaging this way of modeling the PSF allows
a photometric precision of 0.05 mag. However, the
calibration of isothermal galaxy-scale gravitational lenses as cosmic
telescopes is very likely limited by the subtraction of
the lens galaxy light. We estimate that this procedure
introduces up to 0.1 magnitudes of systematic error into
the source galaxy photometry. However, this is still
smaller than the error introduced by the assumption of
an isothermal density profile for the lens itself.

With this in mind we draw the following conclusions 
about the source behind SDSSJ0737+3216 : 

• Our photometry is robust enough to permit a reconstruction
of the SED, and we find a stellar mass
of $(2.0 \pm 1.0_{\rm stat} \pm 0.8_{\rm sys}) \times 10^{9}\,M_\odot$. This is a factor
of 5 smaller than the completeness limit of the
GEMS disk galaxy analysis of Barden et al. (2005),
and also smaller than the least massive spheroid at
this redshift studied by McIntosh et al. (2005).

• The Sersic profile parameters of the source can be
measured to high accuracy. We find an effective
radius of $(0.59 \pm 0.007_{\rm stat} \pm 0.1_{\rm sys})$ kpc ($\approx 0.09$
arcsec with $\sim 10\%$ accuracy), and a Sersic index
of $(1.0 \pm 0.1_{\rm stat} \pm 0.2_{\rm sys})$ in the F814W band ($\sim$
rest-frame B), and that these values change little
over the rest-frame optical range.

• This very small galaxy lies approximately 3-sigma
below the local size-mass relation for disks. However,
it shares the properties of the smallest of the
compact narrow emission line galaxies of Koo et al.
(1994), and, despite its low Sersic index, is more
typical of the dwarf early-type galaxies observed
in the Virgo cluster (Ferrarese et al. 2006) and
the "elliptical" galaxies studied by McIntosh et al.
(2005) at high redshift.
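As a cross-check on the angular and physical sizes quoted in the second item above, the following minimal sketch converts 0.09 arcsec at the source redshift (0.5812) into kiloparsecs. The cosmology is an assumption made only for this illustration (flat $\Lambda$CDM with $H_0 = 70$ km/s/Mpc and $\Omega_m = 0.3$), and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Flat LambdaCDM angular diameter distance in Mpc (Simpson's rule).

    h0 and omega_m are assumed values; steps must be even.
    """
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return comoving / (1.0 + z)

def arcsec_to_kpc(theta_arcsec, z):
    """Physical size in kpc subtended by theta_arcsec at redshift z."""
    d_a_kpc = angular_diameter_distance_mpc(z) * 1000.0
    return d_a_kpc * math.radians(theta_arcsec / 3600.0)

r_kpc = arcsec_to_kpc(0.09, 0.5812)
print(f"0.09 arcsec at z = 0.5812 is about {r_kpc:.2f} kpc")
```

Under these assumed parameters the result is close to 0.6 kpc, in line with the effective radius quoted above. A different choice of cosmological parameters would shift the value by a few per cent.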

While the planned statistical analysis of a large sample 
of lensed galaxies will rely on the detailed understanding 
of the selection function, it is clear that the magnifying 
effect of gravitational lenses allows us to extend current 
size-mass studies to smaller sizes and lower masses than 
would otherwise be available, posing fresh challenges to 
models of galaxy formation and evolution.

APPENDIX

THE EFFECT OF THE LENS MASS DENSITY SLOPE ON THE INFERRED SOURCE SIZE AND MAGNITUDE 

The local magnifying and distorting effect of a gravitational lens (see e.g. Schneider 2006) can be summarized by the
(inverse) amplification matrix, $A^{-1}$:

$$ A^{-1} = \begin{pmatrix} 1 - \kappa + \gamma & 0 \\ 0 & 1 - \kappa - \gamma \end{pmatrix}, \tag{A1} $$



where $\kappa$ and $\gamma$ are two combinations of the spatial second derivatives of the projected gravitational potential ($\kappa$ is
proportional to the projected (surface) mass density), expressed in a Cartesian coordinate system aligned with the radial and
tangential directions. To first order, a source of width $dx$ and length $dy$ (also aligned with these axes) is distorted into
an image of width $dx_i$ and length $dy_i$ according to



$$ \begin{pmatrix} dx \\ dy \end{pmatrix} = A^{-1} \begin{pmatrix} dx_i \\ dy_i \end{pmatrix}. \tag{A2} $$



The $A_{11}$ component describes the radial stretching of the source, while the $A_{22}$ component describes the tangential
stretching. The factor by which the solid angle subtended is increased due to the lensing effect is the magnification
$\mu = |A| = 1/|A^{-1}|$.

In terms of the Einstein radius ($\theta_E$, the radius at which the magnification is formally infinite), the quantities $\kappa$ and
$\gamma$ are given by

$$ \frac{\kappa}{2 - m} = \frac{\gamma}{m} = \frac{1}{2} \left( \frac{\theta}{\theta_E} \right)^{-m} \tag{A3} $$

for a simple, spherically-symmetric lens with a power-law density profile (with logarithmic slope $m$). Two images form at positions $\theta_\pm$ that solve the lens equation,



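$$ \beta = \theta_\pm - \alpha(\theta_\pm), \quad \text{where, in this case,} \quad \alpha(\theta_\pm) = \left[ \kappa(\theta_\pm) + \gamma(\theta_\pm) \right] \theta_\pm. \tag{A4} $$

If the source position $\beta \ll \theta_\pm$, as is the case when the images are highly magnified and are close to forming an
Einstein ring, we find the images at

$$ \theta_\pm \approx \theta_E (1 \pm \epsilon), \quad \text{where} \quad \epsilon = \frac{\beta}{m\,\theta_E}. \tag{A5} $$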

The offset $\epsilon$ is well-constrained by the data, and so we proceed treating $\epsilon$ as a small ($\ll 1$) constant. At this point
we note that the image positions and distortions do contain some information on the density slope $m$, allowing this
parameter to be fitted. What we are working towards here is a quantification of the effect of perturbing the slope $m$
away from the isothermal value ($m = 1$).

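Evaluating $\kappa$ and $\gamma$ at the image positions, substituting into equation A1 and expanding to first order in $\epsilon$ we find
that

$$ A^{-1} \approx \begin{pmatrix} m (1 \pm \epsilon \mp m\epsilon) & 0 \\ 0 & \pm m\epsilon \end{pmatrix}, \tag{A6} $$

and that the inverse magnification is (also to first order) $\mu^{-1} \approx \pm m^{2} \epsilon$.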
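The first-order result can be checked numerically. The sketch below assumes the standard power-law forms $\kappa = \frac{1}{2}(2-m)(\theta/\theta_E)^{-m}$ and $\gamma = \frac{1}{2}m(\theta/\theta_E)^{-m}$, which reduce to the singular isothermal sphere for $m = 1$. It compares the exact diagonal elements of $A^{-1}$ at $\theta_E(1 \pm \epsilon)$ with their first-order expansions. The function names and the trial values of $m$ and $\epsilon$ are illustrative choices, not taken from the analysis.

```python
def inverse_amplification(m, eps, sign=+1):
    """Exact diagonal of A^-1 for a power-law lens at theta_E * (1 + sign*eps)."""
    x = 1.0 + sign * eps                 # theta / theta_E
    mean_kappa = x ** (-m)               # kappa + gamma
    kappa = 0.5 * (2.0 - m) * mean_kappa
    gamma = 0.5 * m * mean_kappa
    return 1.0 - kappa + gamma, 1.0 - kappa - gamma

def first_order(m, eps, sign=+1):
    """First-order expansion in eps: m(1 +/- eps -/+ m*eps) and +/- m*eps."""
    return m * (1.0 + sign * eps * (1.0 - m)), sign * m * eps

m, eps = 1.1, 0.01   # trial values, chosen only for illustration
for sign in (+1, -1):
    radial, tangential = inverse_amplification(m, eps, sign)
    radial_1, tangential_1 = first_order(m, eps, sign)
    inv_mu = radial * tangential
    inv_mu_1 = sign * m ** 2 * eps
    print(sign, radial - radial_1, tangential - tangential_1, inv_mu - inv_mu_1)
```

For these trial values the residuals are of order $\epsilon^2$, as expected for a first-order expansion.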

We can now use this result to estimate the uncertainty on the inferred source size (denoted by $\sigma_r$) caused by a
systematic error in the model slope $m$. We first note that the inferred source plane solid angle is given by

$$ \Omega \approx dx \cdot dy \approx \Omega_\pm\, \mu^{-1}(m), \tag{A7} $$

where $\Omega_\pm$ is the solid angle subtended by each image, and $\Omega \propto r^{2}$. A small change in the density slope away from a
fiducial value of 1 gives rise to an error in source area according to



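$$ \sigma_\Omega = \Omega_\pm \left| \frac{\partial \mu^{-1}}{\partial m} \right|_{m=1} \sigma_m, \tag{A8} $$

$$ \text{such that} \quad \frac{\sigma_\Omega}{\Omega} = \frac{1}{|\mu^{-1}|} \left| \frac{\partial \mu^{-1}}{\partial m} \right|_{m=1} \sigma_m. \tag{A9} $$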



From this, and the result above, we get that 



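$$ \frac{\sigma_\Omega}{\Omega} \approx 2 \sigma_m, \tag{A10} $$

$$ \text{and so} \quad \frac{\sigma_r}{r} \approx \sigma_m. \tag{A11} $$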

Since gravitational lensing conserves surface brightness, the inferred source flux is simply proportional to the inferred
source solid angle $\Omega$: consequently, the error in the AB magnitude due to uncertainty in the density profile slope is
$(2.5/\ln 10)(\sigma_\Omega/\Omega) \approx 2.2\sigma_m$.
Koopmans et al. (2006) give $\sigma_m = 0.12$ for the intrinsic spread of power-law indices, where the profile is constrained
at two radii, the Einstein radius and the (smaller) effective radius. (Note that this small scatter was not appreciated
by e.g. Knudson et al. (2001) in their analysis of magnification errors.) While the power law index they quote is not
quite the local slope at the Einstein radius that we require here, the range of radii they consider brackets the Einstein
radius of SDSSJ0737+3216 and therefore their value for $\sigma_m$ provides an approximate quantification of the size of the
density slope systematic error.
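For concreteness, the arithmetic implied by these relations can be written out as a short sketch. It assumes only the first-order results above (fractional area error of $2\sigma_m$, fractional size error of $\sigma_m$) and the quoted $\sigma_m = 0.12$. The function name is hypothetical.

```python
import math

def slope_systematics(sigma_m):
    """Fractional area, fractional size and magnitude errors from a slope error."""
    frac_area = 2.0 * sigma_m                      # fractional error on solid angle
    frac_size = sigma_m                            # fractional error on size
    mag = (2.5 / math.log(10.0)) * frac_area       # about 2.2 * sigma_m
    return frac_area, frac_size, mag

frac_area, frac_size, mag = slope_systematics(0.12)   # sigma_m quoted in the text
print(f"area {frac_area:.2f}, size {frac_size:.2f}, magnitude {mag:.2f} mag")
```

With $\sigma_m = 0.12$ this gives a size uncertainty of about 12% and a magnitude uncertainty of about 0.26 mag, which is larger than the 0.1 mag attributed to lens galaxy light subtraction.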



